In [1]:
import pandas as pd
import numpy as np
import sklearn.metrics

Read in labels and performance data:

In [2]:
data = pd.read_csv('labels_and_performance.csv')
data.head()

Unnamed: 0,Pneumonia,Atelectasis,Effusion,Pneumothorax,Infiltration,Cardiomegaly,Mass,Nodule,algorithm_output
0,1,1,1,0,0,0,0,0,1
1,1,1,0,0,0,1,0,0,1
2,1,0,1,0,0,0,0,0,1
3,1,1,1,0,0,1,0,0,1
4,0,1,0,0,0,0,0,0,0


First, look at the overall performance of the algorithm for the detection of pneumonia:

In [3]:
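# Passing labels=[0, 1] fixes the matrix layout to [[TN, FP], [FN, TP]],
# so ravel() yields the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = sklearn.metrics.confusion_matrix(data.Pneumonia.values,
                                                  data.algorithm_output.values,
                                                  labels=[0, 1]).ravel()

In [4]:
# Sensitivity: fraction of true pneumonia cases that the algorithm flags
sens = tp/(tp+fn)
sens

0.8166666666666667

In [5]:
# Specificity: fraction of pneumonia-free cases that the algorithm clears
spec = tn/(tn+fp)
spec

0.8235294117647058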
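
To make the definitions behind `sens` and `spec` explicit, here is a minimal, dependency-free sketch that counts the four confusion-matrix cells by hand. The helper name `sensitivity_specificity` and the toy labels are assumptions made for illustration; they are not part of the notebook's data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels; 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # pneumonia, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # pneumonia, missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # no pneumonia, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # no pneumonia, cleared
    # Guard against empty classes so the function never divides by zero
    sens = tp / (tp + fn) if (tp + fn) else float('nan')
    spec = tn / (tn + fp) if (tn + fp) else float('nan')
    return sens, spec


# Hypothetical toy labels: 4 positives (3 detected), 6 negatives (5 cleared)
toy_truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
toy_sens, toy_spec = sensitivity_specificity(toy_truth, toy_pred)
print(toy_sens, toy_spec)
```

Because the toy sensitivity (3/4) and specificity (5/6) differ, the same labels can be passed to `sklearn.metrics.confusion_matrix` to confirm that its cells are being unpacked in the intended order.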

Now, look at the algorithm's performance in the presence of the other diseases: 

In [6]:
for i in ['Atelectasis','Effusion','Pneumothorax','Infiltration','Cardiomegaly','Mass','Nodule']:

    # Restrict to images where finding i is present
    subset = data[data[i]==1]

    # labels=[0, 1] keeps the matrix 2x2 and ordered as tn, fp, fn, tp after ravel()
    tn, fp, fn, tp = sklearn.metrics.confusion_matrix(subset.Pneumonia.values,
                                                      subset.algorithm_output.values,
                                                      labels=[0, 1]).ravel()
    sens = tp/(tp+fn)
    spec = tn/(tn+fp)

    print(i)
    print('Sensitivity: '+ str(sens))
    print('Specificity: ' +str(spec))
    print()

Atelectasis
Sensitivity: 0.782608695652174
Specificity: 0.8333333333333334

Effusion
Sensitivity: 0.6521739130434783
Specificity: 0.8571428571428571

Pneumothorax
Sensitivity: 0.8571428571428571
Specificity: 0.6666666666666666

Infiltration
Sensitivity: 0.3888888888888889
Specificity: 0.0

Cardiomegaly
Sensitivity: 0.8888888888888888
Specificity: 1.0

Mass
Sensitivity: 0.9285714285714286
Specificity: 0.8666666666666667

Nodule
Sensitivity: 1.0
Specificity: 0.5384615384615384



### Statement on algorithmic limitations:

The results above indicate that infiltration is the main limitation of this algorithm. Among chest x-rays that show infiltration, sensitivity for pneumonia falls to 0.39 and specificity to 0.0, meaning that every pneumonia-negative image with infiltration was classified as pneumonia. Nodules and pneumothorax reduce specificity (0.54 and 0.67, against 0.82 overall) and may increase the number of false positive pneumonia classifications, while effusion reduces sensitivity (0.65, against 0.82 overall) and may cause pneumonia to be missed.
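
Rates alone hide how many images sit behind each number, so a value of 0.0 or 1.0 is hard to interpret without the underlying counts. The sketch below tabulates the counts next to the rates for each finding. The function name `subgroup_summary` and the toy DataFrame are hypothetical; only the column names `Pneumonia` and `algorithm_output` mirror the notebook.

```python
import pandas as pd


def subgroup_summary(df, findings, truth_col='Pneumonia', pred_col='algorithm_output'):
    """Per-finding confusion counts and rates for the rows where the finding is present."""
    rows = []
    for finding in findings:
        sub = df[df[finding] == 1]
        truth = sub[truth_col] == 1
        pred = sub[pred_col] == 1
        tp = int((truth & pred).sum())
        fn = int((truth & ~pred).sum())
        fp = int((~truth & pred).sum())
        tn = int((~truth & ~pred).sum())
        rows.append({
            'finding': finding,
            'n': len(sub),
            'tp': tp, 'fn': fn, 'fp': fp, 'tn': tn,
            # nan marks a rate whose denominator is empty in this subgroup
            'sensitivity': tp / (tp + fn) if (tp + fn) else float('nan'),
            'specificity': tn / (tn + fp) if (tn + fp) else float('nan'),
        })
    return pd.DataFrame(rows).set_index('finding')


# Hypothetical toy data, not taken from labels_and_performance.csv
toy = pd.DataFrame({
    'Pneumonia':        [1, 1, 1, 0, 0, 0, 1, 0],
    'Effusion':         [1, 1, 0, 1, 1, 0, 0, 0],
    'Nodule':           [0, 0, 1, 0, 0, 1, 1, 0],
    'algorithm_output': [1, 0, 1, 1, 0, 0, 1, 0],
})
summary = subgroup_summary(toy, ['Effusion', 'Nodule'])
print(summary)
```

On the real data, the same call would take `data` and the list of seven findings used in the loop above. A subgroup with only a handful of pneumonia-negative images would then be visible directly in the `fp` and `tn` columns.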